[00:00.000 --> 00:04.780]  Thanks for coming to my talk. I'm a first-time speaker at Crypto & Privacy Village, though
[00:04.780 --> 00:10.480]  I've been at DEF CON before, last year. Regrettably, we couldn't get together this year, but I sure
[00:10.480 --> 00:16.440]  hope I get to see you all in Vegas next year. Today, I'll be talking about COVID-19 apps,
[00:16.440 --> 00:21.580]  and how Norway became worst-in-class in digital contact tracing from a privacy perspective.
[00:22.760 --> 00:29.300]  First off, a short introduction. My name is Eivind Arvesen. I work as a consultant at a Norwegian
[00:29.300 --> 00:37.200]  consultancy called Bouvet in Oslo, Norway. I mainly focus on privacy and security these days,
[00:37.200 --> 00:45.460]  typically via AppSec, pen testing, GDPR, advisory, and so on. My roles these days tend
[00:45.460 --> 00:54.440]  to be tech lead, architect, advisory work, etc. And day-to-day, I work on a few projects within
[00:54.440 --> 01:01.300]  critical infrastructure. I'm a huge privacy geek. I've written and spoken about stuff like GDPR
[01:01.300 --> 01:06.800]  and privacy issues in append-only storage, browser fingerprinting, and mass surveillance
[01:06.800 --> 01:13.280]  of bulk metadata collection. And earlier this year, I was a part of the government-appointed
[01:13.280 --> 01:19.360]  expert group that were to evaluate the Norwegian solution for digital contact tracing of COVID-19.
[01:20.740 --> 01:28.580]  Now, before I begin, a few disclaimers. I do not represent the expert group as such in this
[01:28.580 --> 01:35.860]  talk, as it no longer exists, nor do I represent any of its members
[01:35.860 --> 01:42.540]  or any other party mentioned here. I can only speak for myself, so this is my own professional
[01:42.540 --> 01:51.360]  opinion. I do not represent my employer either. Additionally, I'm an engineer, not a doctor, so
[01:51.360 --> 01:58.660]  take any health statements or health-related statements with a grain of salt. Same goes for
[01:59.340 --> 02:07.120]  legal stuff, as I'm not a lawyer. And most of the media coverage I reference here that
[02:07.120 --> 02:14.480]  originates in Norway has been run through Google Translate for the occasion. Additionally, some of
[02:14.480 --> 02:19.820]  the materials here are from or based on the final public report of our expert group, and others are
[02:19.820 --> 02:28.760]  my own opinion, but I'll be sure to point that out. So back in June, Amnesty International published
[02:28.960 --> 02:35.360]  a report naming and shaming the most dangerous contact tracing apps for privacy, where Norway
[02:35.360 --> 02:43.300]  appeared amongst countries like Bahrain and Kuwait. Now, you might ask, what's Norway anyway? So for
[02:43.300 --> 02:53.880]  those unfamiliar, Norway is basically the land of fjords, oil, hygge (never mind that
[02:53.880 --> 03:03.180]  the book says Danish), black metal, and blue parrots, apparently. Long story short, we're a Scandinavian
[03:03.180 --> 03:13.500]  Nordic country, i.e. we're in the northern part of Europe, and we consistently score highly on a
[03:13.500 --> 03:18.840]  bunch of metrics that basically tell you we're doing pretty all right. Stuff like Democracy Index
[03:18.840 --> 03:28.060]  and Better Life Index, Happiness Report, and so on. So additionally, we tend to hold a high level
[03:28.060 --> 03:34.060]  of trust in the government and in public services in general. We're also a highly digitalized
[03:34.060 --> 03:42.980]  country, and we pride ourselves on knowledge and in trusting experts. So how then could
[03:42.980 --> 03:49.150]  this happen in Norway, a liberal western democracy and a human rights champion?
[03:50.820 --> 03:57.820]  Well, a bit of background. As you're all aware, in the beginning of this year, COVID-19
[03:57.820 --> 04:03.820]  was spreading like wildfire throughout the world. And while it had first been reported as another
[04:03.820 --> 04:12.480]  exotic local virus outbreak in China, it was soon seen as a global threat. Now, pretty early on,
[04:12.480 --> 04:19.160]  I think February, we could read early reports on digital contact tracing in some parts of the world.
[04:20.260 --> 04:25.960]  For instance, we had Singapore, which used a mainly Bluetooth-based
[04:27.320 --> 04:32.120]  application they developed themselves, which hit upon a few snags we'll get back to.
[04:33.060 --> 04:37.940]  Then there was South Korea, which used a combination of cell phone data,
[04:37.940 --> 04:43.800]  credit card purchase history, and surveillance cameras. And then there was Israel, where
[04:44.360 --> 04:52.380]  basically the government would receive geolocation data from phone companies and was to
[04:52.380 --> 04:58.380]  download information and call records from mobile phones to track and identify people. So they basically
[04:58.380 --> 05:07.580]  put the intelligence service on surveilling the population. So these are different approaches,
[05:07.580 --> 05:13.280]  and depending on what vectors and sensors were used, what data were communicated where and to
[05:13.280 --> 05:21.200]  whom, and so on, there were different privacy impacts, obviously. So while digital solutions can
[05:21.200 --> 05:28.560]  obviously be a potentially valuable tool, and might be useful in this case, it was also evident that
[05:28.560 --> 05:33.840]  certain governments cared more about civil liberties and privacy than others.
[05:35.480 --> 05:42.640]  So this is an illustration from Carl Frederik Schoellam and the CDC, illustrating contact
[05:42.640 --> 05:49.140]  tracing, which is a part of the track and trace strategy many countries are taking.
[05:50.840 --> 05:58.560]  So basically, when someone is diagnosed with COVID-19,
[05:58.560 --> 06:02.620]  you're trying to find out who's been exposed by that person to the disease, right?
[06:03.500 --> 06:08.680]  So this could be household members, friends, or others that this so-called index patient has
[06:08.680 --> 06:15.600]  been in close contact with. Now, this is most commonly done via interviews with diagnosed,
[06:15.600 --> 06:20.120]  i.e. laboratory-confirmed, patients, trying to figure out where they've
[06:20.120 --> 06:23.680]  been at what time and with whom during their infectious period.
[06:26.000 --> 06:31.400]  Now, in the end of February, the coronavirus was first registered in Norway.
[06:31.960 --> 06:38.800]  And in early to mid-March, the government implemented what they call the most invasive
[06:38.800 --> 06:44.640]  measures in peacetime, shutting down schools and colleges and universities and hairdressers,
[06:44.640 --> 06:57.680]  shopping centers, and so on. And in March of this year, the Norwegian Public Broadcasting Service
[06:57.680 --> 07:06.160]  stumbled upon a public source code repo on GitHub, which revealed the existence of a digital contact
[07:06.160 --> 07:12.320]  tracing app by a company called Simula Research Laboratory on behalf of the Norwegian Institute
[07:12.320 --> 07:17.200]  of Public Health, which is a government agency under the Ministry of Health and Care Services.
[07:18.440 --> 07:24.680]  Simula had been chosen as a supplier, producing parts of the solution. So other parts of the
[07:24.680 --> 07:31.000]  government would do some integrations and extending existing services and stuff like that.
[07:33.260 --> 07:38.800]  And we were soon able to read about some of the controversial design choices of this app,
[07:38.800 --> 07:46.760]  such as centralized storage, collection of location data, producing long-term anonymized
[07:46.760 --> 07:52.700]  and aggregated data potentially, which made a few alarm bells go off.
[07:53.940 --> 08:01.180]  Because when people talk about anonymized data, what they often mean is de-identified data,
[08:01.180 --> 08:09.760]  which can potentially be re-identified. So it was also made clear that the source code would
[08:09.760 --> 08:18.380]  not be publicly available. So all of this brought about a bit of criticism and public debate,
[08:19.480 --> 08:25.200]  and didn't really bode well. Many were critical of this, for many reasons.
[08:26.180 --> 08:31.600]  Now, this is what the final app looked like. On the left, you can see a splash screen,
[08:31.600 --> 08:37.460]  then there's the onboarding process, and on the far right, your run-of-the-mill multi-page
[08:37.460 --> 08:48.280]  privacy policy. To break it down, Smittestopp was a closed-source solution. It required
[08:48.280 --> 08:53.780]  registration and de facto identification of users, because users had to register using their
[08:53.780 --> 09:03.020]  phone number, which in Norway is very tightly connected to a specific person. It's almost
[09:03.020 --> 09:12.720]  impossible to get a phone number anonymously. So it also collects sensor data from multiple
[09:12.720 --> 09:20.380]  sources, both Bluetooth and location services. It uploads data from all the users, all the time,
[09:20.380 --> 09:28.700]  to a centralized storage that the government controls. And there's the use of a static,
[09:28.700 --> 09:35.280]  device-specific identifier when devices running the app communicate. So they identify themselves
[09:35.280 --> 09:43.520]  using the same identifier every time. Now, the degree to which, if any, there is
[09:43.520 --> 09:48.520]  data minimization in such a solution has been questioned by experts in the public debate
[09:48.520 --> 09:59.100]  from the get-go. And any privacy engineer, and indeed many others with even a small amount of technical
[09:59.100 --> 10:06.370]  or practical understanding, will quickly see that these design choices have big privacy implications.
[10:07.740 --> 10:15.440]  So, comparatively, Norway was fairly early on in rolling out an app, and the app itself is
[10:15.440 --> 10:21.020]  arguably one of the most invasive ones on the market, at least in a European context, where
[10:21.020 --> 10:26.580]  there are few, if any, other countries that have the same configuration of privacy-impacting factors
[10:26.580 --> 10:38.320]  in their applications. So, the formal basis for processing, in the case of Smittestopp, is
[10:40.860 --> 10:50.100]  a dedicated regulation. So, the app is voluntary to use as such, but its formal basis for processing
[10:50.100 --> 10:59.240]  is a regulation authorized by the Diseases Act, not consent. So, outside of this, the GDPR, or the Norwegian
[10:59.240 --> 11:08.040]  implementation of GDPR, obviously applies as usual. So, the app
[11:12.220 --> 11:18.980]  is different from most other COVID-19 contact tracing apps, in that most other
[11:18.980 --> 11:25.100]  apps do contact tracing, and that's it, while Smittestopp has two purposes.
[11:28.320 --> 11:35.060]  One is to notify users of close contact with people that have been infected,
[11:35.840 --> 11:44.840]  and the other is to collect data that can be used in evaluating interventions that aim to lower
[11:44.840 --> 11:51.280]  infection rates. For instance, looking at public movement patterns, as well as further research
[11:51.280 --> 11:59.760]  analysis. Maybe input to epidemiological models, for instance. But the current app
[11:59.760 --> 12:06.140]  is made in sort of an all-or-nothing fashion, in that users can choose to have their data
[12:06.140 --> 12:12.940]  used for all the app's purposes, or not to use the app at all. Now, it's obviously not
[12:12.940 --> 12:19.920]  ideal to not let users explicitly opt in for either purpose, nor is it really even best
[12:19.920 --> 12:26.660]  practice. And, not to mention, a potential consequence of implementing the app this way
[12:27.920 --> 12:35.560]  is that users feel that they're not in control of their data, and maybe they don't fully
[12:35.560 --> 12:42.220]  understand what happens to their data in what case. And that may, in turn, hamper user uptake.
[12:42.260 --> 12:48.560]  So, you might not get as many users installing and using the app, which would lead to a lowered
[12:48.560 --> 12:59.020]  effect from the app. About location data. So, what one is interested in when performing
[12:59.020 --> 13:05.400]  contact tracing is really who met whom. The identity of either party, or the location
[13:07.060 --> 13:15.900]  of contact, is not necessarily relevant to prove contact itself. So, you thus don't necessarily
[13:15.900 --> 13:21.020]  need to know who the involved parties are, or where the contact took place.
[13:21.620 --> 13:28.460]  But the argument made for the use of location data, in the case of Smittestopp,
[13:28.460 --> 13:35.400]  is that they wanted to attempt to compensate for the lack of data quality as a consequence of
[13:35.400 --> 13:41.620]  Bluetooth API limitations at the time. Because Bluetooth wouldn't work reliably in the background
[13:41.620 --> 13:50.220]  on iOS, whereas Android might just kill background processes that continuously use
[13:50.220 --> 13:56.740]  Bluetooth or location services for energy or privacy reasons. Now, there's also another
[13:56.740 --> 14:05.580]  factor here, which is that in Norway, being a country where people have a lot of purchasing power,
[14:05.580 --> 14:13.500]  iOS has more of a market share than in other places. So, that's a factor here as well.
[14:15.120 --> 14:21.260]  But on the other hand, GPS or location services typically have an accuracy of 3 to 10 meters
[14:21.260 --> 14:29.840]  under ideal conditions, meaning outdoor usage. So, you know, the quality of location
[14:30.700 --> 14:38.480]  service data from indoor use is uncertain, let's say.
[14:39.280 --> 14:44.640]  So, a proper and transparent evaluation of all the possibilities available here, then,
[14:44.640 --> 14:50.960]  might include something like looking at how big of a problem the then-current
[14:50.960 --> 14:57.520]  API limitations were in practice. Like, could you get by with spotty Bluetooth data, i.e. data only
[14:57.520 --> 15:05.460]  from when users were actively using the application? And that's probably a no, because
[15:05.460 --> 15:10.700]  then you'd always only have data from when users were unlocking their phone and having the app in
[15:10.700 --> 15:17.420]  the foreground. Then there's the Singaporean approach, where they, in practice, implemented
[15:17.420 --> 15:26.000]  like a faux sleep mode. Well, this necessitated keeping the app in the foreground, but
[15:26.460 --> 15:31.600]  you could dim the screen when the device was rotated, like placed upside down, so as to not
[15:31.600 --> 15:39.320]  totally drain the battery. So that's something. And that also lets you keep getting
[15:40.080 --> 15:46.020]  Bluetooth data continuously. Or you could collect location data, which is personally
[15:46.020 --> 15:53.080]  identifiable information, which is what we did in the Norwegian case. And then you'd have to think,
[15:53.080 --> 15:57.560]  how do the privacy implications of the respective alternatives here size up against each other and
[15:57.560 --> 16:04.960]  against the issue at hand? And another thing we have to mention here too is that,
[16:04.960 --> 16:12.700]  though it has been claimed in several contexts that the data is anonymous,
[16:12.980 --> 16:18.720]  this is really incorrect, because by virtue of being personally identifiable information,
[16:18.720 --> 16:26.880]  location data can, by definition, not be anonymous. Because location data can, in itself,
[16:26.880 --> 16:32.260]  reveal a person's identity. That means there is no such thing as anonymous location data
[16:32.260 --> 16:39.240]  on an individual basis. Whereas in aggregated datasets, you can have certain quantifiable
[16:39.240 --> 16:45.640]  guarantees about the degree of privacy, for instance, via k-anonymity or differential privacy.
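To make the two guarantees named here concrete, the following is a minimal Python sketch. It is not taken from Smittestopp or from the expert group's report: the record fields (`zone`, `hour`), the helper names `smallest_group` and `private_count`, and the epsilon value are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier
    values; the data set is k-anonymous for every k up to this number."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())

def private_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query (sensitivity 1): adding
    Laplace(0, 1/epsilon) noise gives epsilon-differential privacy."""
    # The difference of two i.i.d. exponential samples is Laplace-distributed.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

# Hypothetical aggregated movement records: which zone, at which hour.
records = [
    {"zone": "A", "hour": 8}, {"zone": "A", "hour": 8},
    {"zone": "A", "hour": 8}, {"zone": "B", "hour": 9},
]

print(smallest_group(records, ["zone", "hour"]))   # 1: the zone B record is unique
print(private_count(3, epsilon=0.5, rng=random.Random(0)))  # noisy version of 3
```

A smallest group of 1 means at least one person stands alone in the data set, which is the kind of measurable statement that a claim of "anonymized" data would need to be checked against.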
[16:46.600 --> 16:52.980]  But this gets complicated pretty quickly for a variety of reasons, such as temporal correlation
[16:53.800 --> 17:04.880]  or re-identification by combining data sources. But in practice, location data is only really
[17:05.040 --> 17:12.920]  a clear functional requirement in the case of monitoring public movement. And that's the other
[17:12.920 --> 17:25.440]  purpose, not contact tracing, but collecting data. About centralized storage. So when we talk about
[17:25.440 --> 17:30.640]  centralized storage in the context of contact tracing apps, we usually mean systems that are
[17:30.640 --> 17:35.560]  based on collected data that's continuously uploaded to a central server that holds all of everyone's
[17:35.560 --> 17:41.820]  data. This is in contrast with decentralized systems, where every user's data is stored on
[17:41.820 --> 17:48.740]  their device until it's needed. And one should also note that the most popular decentralized
[17:48.740 --> 17:56.620]  solutions for digital contact tracing are not fully distributed, in that they still use a central
[17:56.620 --> 18:03.660]  server as a communications channel of sorts, as opposed to purely peer-to-peer communication.
[18:03.660 --> 18:11.620]  So for instance, a central server might hold a list of identifiers from people who have been sick
[18:11.620 --> 18:17.920]  in the last two weeks, and then other apps poll the server regularly and then check locally if
[18:17.920 --> 18:23.700]  they've been in contact with any of the publicly listed keys of infected users.
[18:26.040 --> 18:32.380]  The argument made in favor of data centralization in the case of the Norwegian app is that
[18:33.600 --> 18:41.060]  augmentation of user data with data from other users was needed in order to perform analysis.
[18:43.460 --> 18:49.160]  It's also a prerequisite, of course, for the purpose of looking at movement patterns,
[18:49.160 --> 18:55.860]  like collecting the data, the second purpose of the app, or to do other unspecified research
[18:56.700 --> 19:05.100]  on it. But a centralized data store is really, in principle, a defining factor when dealing
[19:05.100 --> 19:11.580]  with private data in this case, because its very existence makes misuse, function creep,
[19:11.580 --> 19:17.100]  leakage, and so on possible in a way that a decentralized solution just plainly doesn't.
[19:17.100 --> 19:25.260]  As in, you can't lose or abuse data you don't have. Now, alternative sources to
[19:26.120 --> 19:33.380]  the aggregated long-term data they hope to collect here may already exist, such as the
[19:33.380 --> 19:40.020]  data that Norwegian telcos had already provided in aggregate form, and that had been used for
[19:40.020 --> 19:48.840]  the same purpose. So they had taken cell site data, basically, and had some sort of coarse
[19:51.200 --> 19:59.500]  estimation of population density at points in time. And the government had used that for
[19:59.500 --> 20:08.340]  this very purpose already. The downside might be that that data might not be as precise
[20:08.340 --> 20:13.500]  as location data that you could get directly from devices, because resolution would depend
[20:13.500 --> 20:21.560]  on many factors, like cell site density and so on. But the privacy cost of uploading every user's
[20:21.560 --> 20:27.260]  location and movement, as well as who they've met and timestamps for all these events,
[20:27.980 --> 20:35.060]  that's undoubtedly much larger than just uploading what data is needed when you think you need it.
[20:35.060 --> 20:42.260]  For instance, prompting users to upload their movements, or even just use Bluetooth data only
[20:43.180 --> 20:47.580]  once a person has been in contact with someone who's positively diagnosed.
[20:49.080 --> 20:55.360]  There's also, of course, the risk of function creep, like the data could be used for something
[20:55.360 --> 21:04.140]  we haven't thought about as of yet, or misuse in case of a regime change.
[21:04.620 --> 21:11.120]  Additionally, if one didn't think of it oneself, there were already publicly developed
[21:11.120 --> 21:16.240]  decentralized protocols back then, so there were lots of materials for inspiration.
[21:17.360 --> 21:24.600]  Now, this is an illustration of one of those privacy-first contact tracing schemes, so-called,
[21:24.600 --> 21:32.280]  which Nicky Case has generously donated to the public domain. I think what's illustrated here is
[21:32.280 --> 21:39.760]  based on DP3T. But basically, Alice's phone broadcasts random messages every few minutes,
[21:39.760 --> 21:44.260]  and then when she sits next to Bob, their phones exchange messages.
[21:45.080 --> 21:48.600]  Both their phones remember what they have said and what they've heard, i.e.,
[21:48.600 --> 21:58.180]  what they've transmitted, sent, and received for the past 14 days. And if Alice gets COVID,
[21:58.400 --> 22:05.920]  she sends her messages to a central server or a hospital. Now, you can't identify her from
[22:05.920 --> 22:13.920]  this list of what messages she has sent, and this can be added to the big public list of
[22:13.920 --> 22:22.660]  messages from everyone who's been diagnosed. So then, Bob's phone can query this central place
[22:22.660 --> 22:29.200]  for a list of keys from messages from everyone who's been
[22:29.200 --> 22:36.420]  infected for the last 14 days. And if his phone recognizes enough of these messages,
[22:36.420 --> 22:41.060]  then he can isolate. So that's the gist of it.
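The Alice and Bob walkthrough above can be sketched in a few lines of Python. This is a toy model of the idea only, not the DP3T specification: the `Phone` class, the `threshold` parameter, and the message size are assumptions for illustration, and the 14-day retention window is left out for brevity.

```python
import secrets

class Phone:
    """Toy model of one phone in the scheme described above."""

    def __init__(self):
        self.sent = []       # random messages this phone has broadcast
        self.heard = set()   # messages received from nearby phones

    def broadcast(self):
        """Emit a fresh random message that carries no identity."""
        message = secrets.token_hex(16)
        self.sent.append(message)
        return message

    def hear(self, message):
        self.heard.add(message)

    def is_exposed(self, published, threshold=2):
        """Compare locally against the public list; nothing is uploaded."""
        return len(self.heard & set(published)) >= threshold

alice, bob, carol = Phone(), Phone(), Phone()
for _ in range(3):               # Alice and Bob sit next to each other
    bob.hear(alice.broadcast())
    alice.hear(bob.broadcast())
carol.hear(bob.broadcast())      # Carol only ever met Bob

public_list = list(alice.sent)   # Alice is diagnosed and uploads what she sent
print(bob.is_exposed(public_list))    # True
print(carol.is_exposed(public_list))  # False
```

Note that the server only ever sees Alice's own random messages; who heard them is worked out on each phone.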
[22:41.720 --> 22:47.560]  Now, when it comes to data integrity and user traceability,
[22:47.560 --> 22:52.980]  the use and communication of static device identifiers makes it possible to track or
[22:52.980 --> 23:00.480]  impersonate others, to trace users in limited partial leaks, and so on. So this is a bad one,
[23:00.480 --> 23:06.320]  because just about every other proposed solution I've seen, both in protocol specifications and in
[23:06.320 --> 23:15.980]  existing apps, uses rolling identifiers in one form or another. So there's also the fact that
[23:15.980 --> 23:23.760]  data was temporarily stored unencrypted in a local database on devices before uploading
[23:24.480 --> 23:30.080]  device data to the central server, which made it possible to inject or modify data before
[23:30.080 --> 23:38.360]  uploading it. Now, that means that data integrity, at least until the point where this was fixed,
[23:38.360 --> 23:44.700]  cannot be guaranteed. Additionally, the application connects to a server using
[23:45.460 --> 23:50.740]  what looks like an everlasting connection string and no other session handling.
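The difference between a static device identifier and the rolling identifiers mentioned above can be illustrated with a short Python sketch. The derivation below (an HMAC over an interval counter) is an assumption chosen for illustration; it is not the derivation used by DP3T, by the Apple and Google APIs, or by Smittestopp, and `rolling_id` and `device-1234` are hypothetical names.

```python
import hashlib
import hmac
import secrets

def rolling_id(device_secret: bytes, interval: int) -> str:
    """Derive a short-lived identifier for one time interval via HMAC-SHA256."""
    digest = hmac.new(device_secret, interval.to_bytes(8, "big"), hashlib.sha256)
    return digest.hexdigest()[:32]

device_secret = secrets.token_bytes(32)   # stays on the device

# What a passive observer logs over three intervals in each design.
static_sightings = ["device-1234" for _ in range(3)]
rolling_sightings = [rolling_id(device_secret, i) for i in range(3)]

print(len(set(static_sightings)))   # 1: every sighting links to the same device
print(len(set(rolling_sightings)))  # 3: sightings look unrelated without the secret
```

With the static identifier, anyone who records broadcasts in several places can join them into a movement trail; with rolling identifiers, that join requires the device's secret.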
[23:52.420 --> 23:59.800]  Now, in order to use the application, users had to register using their phone number.
[23:59.800 --> 24:06.460]  In Norway, this is de facto identifying yourself, because it's almost impossible to get a phone
[24:06.460 --> 24:11.580]  number without identifying yourself. So there is a connection there, which is even public in many
[24:11.580 --> 24:18.020]  cases, i.e. you can search for a name or a phone number and get the match on public services here.
[24:19.640 --> 24:25.780]  So functionally, there's no real need to identify any involved party, because even in contact
[24:25.780 --> 24:32.900]  tracing, users could be notified by the application when a contact has been diagnosed
[24:33.640 --> 24:41.340]  with COVID by health authorities. Now, one could argue that registration is a mechanism
[24:41.340 --> 24:49.640]  that protects against bogus uploads to some extent. But this, in addition to protection of privacy,
[24:49.640 --> 24:55.460]  is in a way already baked into many of the decentralized approaches that demand human
[24:55.460 --> 25:02.280]  intervention before any upload takes place. So for instance, I think in the case of DP3T,
[25:02.280 --> 25:09.560]  they actually distribute upload codes so that when you're diagnosed, the doctor gives you an
[25:09.560 --> 25:15.860]  upload code that you can use to upload your data, i.e. the keys from your infected period
[25:15.860 --> 25:23.160]  to the central datastore. And many of these other approaches also let users
[25:24.280 --> 25:30.300]  choose specifically what time spans to share, or alternatively edit out some time spans to
[25:30.300 --> 25:38.500]  not share them for any reason. Also, Smittestopp was found to be uploading a bunch of analytics
[25:38.500 --> 25:44.080]  data, including potentially fingerprintable information, on just about any interaction
[25:44.080 --> 25:49.700]  the users do with the application, i.e. from the get-go, from the moment you open the app,
[25:49.700 --> 25:56.570]  analytics are firing, without telling users this. And this was not stated in the privacy policy,
[25:57.460 --> 26:02.380]  and users could not choose whether they want to upload this data or not.
[26:05.380 --> 26:11.260]  There are also some legal implications, I think, but note I'm not a lawyer, so you'll need to
[26:11.800 --> 26:17.860]  hear the reflections with this in mind. So the regulation I mentioned earlier, that was
[26:18.480 --> 26:24.460]  the formal basis for processing of the data, mentions that health and location data collected
[26:24.460 --> 26:31.200]  for this purpose cannot be shared with law enforcement or other parties. But Bluetooth
[26:31.200 --> 26:42.180]  data, however, isn't mentioned. Now, I interpret this as saying that sharing of Bluetooth data
[26:42.180 --> 26:51.300]  is permitted, which would mean that the parties that the data is shared with
[26:51.300 --> 26:56.300]  could be able to, for instance, build social graphs of the data subjects.
[26:56.820 --> 27:02.580]  Though the regulation puts in place certain limitations, like including a sunset clause,
[27:03.280 --> 27:09.940]  the regulation also states that it can be changed at any time by the government via a new regulation.
[27:12.220 --> 27:18.980]  There's also the fact that data was stored on Microsoft servers in Ireland,
[27:19.780 --> 27:30.660]  and in Ireland, there's already precedent for delivering user data to American authorities,
[27:30.660 --> 27:37.720]  which can be done via the CLOUD Act, I think, where the U.S. government can demand and
[27:37.720 --> 27:46.910]  secretly obtain data stored on the servers of American providers, even abroad. Now,
[27:47.810 --> 27:54.090]  another consequence of what the Norwegian solution ended up looking like is that
[27:55.750 --> 27:59.790]  it would be hard, if not impossible, to at least automatically achieve
[27:59.790 --> 28:05.410]  data interoperability in collaboration with other European countries,
[28:05.410 --> 28:13.370]  because most of these have already implemented or will implement solutions based on Apple and
[28:13.370 --> 28:23.210]  Google's new APIs, and/or that are compatible with other protocols, such as DP3T. Now,
[28:23.210 --> 28:28.830]  other countries' contact tracing systems will, therefore, in theory, be able to register contact
[28:29.530 --> 28:35.230]  events that involve citizens of other countries, and or persons using other apps even,
[28:35.230 --> 28:39.810]  potentially, including apps produced by other countries' health officials.
[28:41.550 --> 28:46.130]  And here's a slide for Miscellany.
[28:47.410 --> 28:53.130]  A few points about the app. The publicly available DPIA
[28:54.450 --> 29:00.590]  appears to not seriously consider alternative approaches and implementations,
[29:01.410 --> 29:10.830]  nor does it really consider malicious use of the data by the legitimate party itself,
[29:10.830 --> 29:17.110]  or data breach, or data leakage, other than via security features of the mobile app.
[29:17.630 --> 29:23.430]  And some of the probabilities stated in the risk assessment seem optimistic.
[29:25.330 --> 29:30.690]  As I mentioned, using a static identifier that's never rotated is obviously a bad idea
[29:30.690 --> 29:38.810]  and makes it possible to track users. And there's also something left to be desired in transparency here,
[29:39.590 --> 29:46.430]  both regarding the purpose of the application, which wasn't very well communicated in the app itself,
[29:46.430 --> 29:53.030]  and regarding just how the data was planned to be anonymized and aggregated for long-term use.
[29:53.030 --> 29:57.210]  That should be clearly and specifically communicated to the public, I think.
[29:59.370 --> 30:06.670]  But the anonymization process was not finished during our expert group evaluation of the app,
[30:07.490 --> 30:11.910]  other than involving various forms of aggregation.
[30:14.270 --> 30:20.690]  So all we can really say about that is we had some meetings with some stakeholders,
[30:20.690 --> 30:23.650]  but no public information on that.
[30:25.890 --> 30:33.090]  If the code were open-sourced, the public would be able to verify the functionality of the app,
[30:33.090 --> 30:39.990]  as opposed to depending on security by obscurity, or as opposed to trusting,
[30:39.990 --> 30:43.490]  for instance, the expert group and their interpretations.
[30:44.250 --> 30:53.010]  Now, the functionality used to bind phone numbers to the cloud device ID
[30:53.010 --> 31:00.870]  is implemented using a so-called preview feature, which the supplier says one should not use to
[31:00.870 --> 31:06.590]  process personal data or any other data that's subject to heightened compliance requirements.
[31:06.590 --> 31:13.390]  That's obviously not great. This was, I think it was B2C functionality in Azure in Ireland,
[31:13.390 --> 31:20.770]  or some such. But there were also various logging and compliance issues, such that
[31:21.310 --> 31:26.770]  users weren't able to see any data about their Bluetooth contacts in the audit solution.
[31:27.110 --> 31:31.530]  They couldn't access logs from health authorities or view audit logs
[31:31.530 --> 31:36.770]  after requesting deletion of their data, which could obviously be easily decoupled.
[31:40.110 --> 31:45.310]  Additionally, the contact analysis code was very complicated and complex,
[31:45.310 --> 31:54.970]  which means it was kind of hard to understand what was actually going on there.
[31:54.970 --> 32:01.690]  It also entailed low quality in a maintainability context, and had weaknesses
[32:01.690 --> 32:14.730]  both in implementation and in software methodology. And finally, the app also used
[32:15.270 --> 32:24.190]  text messages, SMS, to notify users if they had been in contact with someone infected with COVID.
[32:27.980 --> 32:34.260]  Now, it's not really known whether digital contact tracing is even a viable solution
[32:34.260 --> 32:40.440]  to the problem at hand. It's unknown what value it can bring, and even if it's feasible at all.
[32:41.180 --> 32:46.320]  So most scientists with direct experience, like the ones involved with the Singaporean
[32:46.320 --> 32:53.900]  program, they claim that digital contact tracing is at best a complementary addition to manual
[32:53.900 --> 32:59.000]  contact tracing. Then there's the fact that the Norwegian app is not in accordance with
[32:59.000 --> 33:04.180]  common European guidelines, such as the EU Commission's recommendations for these sorts
[33:04.180 --> 33:10.480]  of apps, the EU resolution on coordinated work against COVID-19, nor the guidelines from the
[33:10.480 --> 33:18.740]  European Data Protection Board. You know, surveilling movements and contacts of all
[33:18.740 --> 33:24.140]  users of the app is an extremely invasive measure. And as the effectiveness and usefulness
[33:24.140 --> 33:30.800]  of this system is not clear, the proportionality of this measure is questionable at best. One might
[33:30.800 --> 33:38.340]  expect some sort of explanation as to why or how this would be effective and why this outweighs
[33:38.340 --> 33:44.060]  the invasiveness of these actions, but this remains to be seen pretty much.
[33:46.800 --> 33:52.640]  Now, as I said earlier, the app isn't open source, and this is because it was claimed that
[33:52.640 --> 34:00.300]  while open sourcing the code would, in the long run, potentially lead to better security,
[34:00.300 --> 34:05.620]  but in a shorter perspective, it might lead the attackers or would-be attackers to
[34:05.620 --> 34:13.200]  be able to exploit any vulnerabilities before anyone else could see them or fix them or so on.
[34:13.200 --> 34:17.230]  So that's the argument that was made to keep the app closed source, basically.
[34:21.260 --> 34:30.920]  And about anonymity in the long term, you know, there was a quote, a comment on our final report
[34:30.920 --> 34:38.500]  where we commented on the state of anonymity in the long-term data, which would be aggregated
[34:38.500 --> 34:46.060]  and anonymized so-called. They responded that the report also has a recommendation of anonymization
[34:46.060 --> 34:56.880]  of data for analysis purposes through so-called differential privacy. The Norwegian Institute of
[34:56.880 --> 35:04.100]  Public Health has at this point already developed an elaborate system for anonymization that,
[35:04.100 --> 35:10.440]  in their view, will have an equally anonymizing effect as so-called differential privacy,
[35:10.440 --> 35:16.300]  but which is easier to implement, communicate, and doesn't lose data quality to speak of.
[35:17.000 --> 35:26.520]  Now, while that statement makes sense syntactically, I guess we'll leave evaluating
[35:26.520 --> 35:32.400]  the logic of it as an exercise for the listener, right? Is it possible to somehow deliver
[35:32.400 --> 35:38.320]  anonymization to the same extent as differential privacy, but without the formal guarantees?
[35:38.760 --> 35:50.200]  You decide. There are also a bunch of potential possible attacks here, like relay attacks,
[35:50.200 --> 35:55.500]  tracking attacks, where if many people collaborated on this, you'd basically have
[35:55.500 --> 36:01.540]  distributed surveillance. And this is obviously possible because of the static identifiers used
[36:01.540 --> 36:12.820]  by the application. Or you can map infection. And you could potentially, from the long-term data,
[36:12.820 --> 36:21.700]  the anonymized and aggregated data, so-called, re-identify people, potentially. And also
[36:21.700 --> 36:31.320]  potential risk of data theft, leak, or misuse, which this risk is obviously magnified because
[36:31.320 --> 36:37.760]  there's a state actor that's in control of all of this data. So the potential for abuse,
[36:37.760 --> 36:44.300]  if it were to be abused, is obviously much higher. Now that we've discussed this app somewhat,
[36:45.020 --> 36:49.280]  let's take a look at how events unfolded earlier this year.
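
The tracking attacks mentioned a moment ago depend on the app broadcasting a static identifier. As a rough illustration of the difference, here is a minimal Python sketch contrasting a static identifier with one that rotates per time window. This is not the Smittestopp code, nor the Apple/Google scheme: the function names, the 15-minute `ROTATION_SECONDS` interval, and the HMAC-based derivation are all assumptions made for the example.

```python
import hashlib
import hmac
import secrets

# Hypothetical rotation interval, chosen only for illustration.
ROTATION_SECONDS = 15 * 60


def static_identifier(device_key: bytes, unix_time: int) -> bytes:
    """Identifier that ignores time: every broadcast carries the same value,
    so anyone who observes it twice can link the two sightings."""
    return hashlib.sha256(device_key).digest()[:16]


def rotating_identifier(device_key: bytes, unix_time: int) -> bytes:
    """Identifier derived from the current time window with a keyed hash.
    Without the device key, values from different windows look unrelated."""
    window = unix_time // ROTATION_SECONDS
    mac = hmac.new(device_key, window.to_bytes(8, "big"), hashlib.sha256)
    return mac.digest()[:16]


device_key = secrets.token_bytes(32)  # stays on the device in this sketch
t0 = 1_600_000_000                    # arbitrary example timestamp
t1 = t0 + ROTATION_SECONDS            # one rotation interval later

static_linkable = static_identifier(device_key, t0) == static_identifier(device_key, t1)
rotating_linkable = rotating_identifier(device_key, t0) == rotating_identifier(device_key, t1)
```

In this sketch an observer can link the static broadcasts at `t0` and `t1` by simple comparison, while the rotating ones differ between windows. A real protocol has to solve more than this, for example how contacts are matched afterwards.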
[36:51.000 --> 36:58.720]  Back in early April, after the public had been informed that this app was in development,
[36:58.720 --> 37:05.340]  and it was going to look like what we just discussed. As a consequence of the criticism,
[37:05.340 --> 37:11.760]  probably, as well as in order to increase transparency and so on, the Ministry of Health
[37:11.760 --> 37:17.240]  and Care Services appointed a group of experts based on recommendations from an organization
[37:17.240 --> 37:25.640]  called IKT Norway. And none of the participants were connected to any of the other involved parties,
[37:26.360 --> 37:33.060]  obviously. But we were basically tasked with evaluating all of the involved components
[37:33.060 --> 37:41.000]  in the solution. This includes mobile apps on respective platforms, a backend encompassing
[37:41.000 --> 37:48.720]  many components, including analysis in order to figure out what's a contact event,
[37:48.720 --> 37:53.820]  and reporting code, and connected services running in the cloud, integrations,
[37:53.820 --> 37:58.200]  and new solutions in existing government systems.
[38:02.600 --> 38:08.580]  We had mostly direct access to all the source code repositories, so we basically just dove into
[38:08.580 --> 38:15.680]  the code and looked at it, and also had a bunch of meetings with all the stakeholders. But
[38:16.600 --> 38:24.080]  our mandate said that we had to deliver an open report, like a public report,
[38:24.080 --> 38:28.180]  where we would assess whether security and privacy are properly taken care of,
[38:28.180 --> 38:33.300]  and also that we communicate with them and tell them about any vulnerabilities,
[38:33.300 --> 38:36.560]  we found any issues, discuss that with them, and so on.
[38:39.460 --> 38:46.720]  So, we delivered a preliminary report after four or five days, and because of that very limited
[38:46.720 --> 38:57.240]  time, we had to limit ourselves to just the smartphone apps and a select part of the backend,
[38:57.240 --> 39:06.540]  and only the technical security aspects. And also, because of the fact that not all the parts of this
[39:06.540 --> 39:12.140]  entire system were finished at this point, or had even been started at that point,
[39:12.140 --> 39:21.780]  we refrained from commenting on privacy. We waited until we'd seen the entirety, or enough.
[39:22.400 --> 39:31.280]  So, deletion of data was not implemented either at this point, but long story short,
[39:31.280 --> 39:35.360]  we pointed to some data integrity issues and the static ID, obviously,
[39:35.360 --> 39:40.360]  as one of the main reasons that the apps shouldn't be launched as is,
[39:41.020 --> 39:48.360]  you know, since there were issues and you'd have to view data as potentially compromised or bad.
[39:48.360 --> 39:56.100]  So, then we recommended that maybe you could do a soft test in a couple of test municipalities,
[39:57.580 --> 40:00.840]  but no more than that would be responsible in our view.
[40:01.640 --> 40:05.860]  Still, we recommended a lot of improvements, most importantly,
[40:05.860 --> 40:09.960]  related to integrity and scalability of the system.
[40:11.160 --> 40:20.980]  What happened then was the app was launched to the entire country at the same time.
[40:22.580 --> 40:30.200]  Well, it was called an evaluation or kind of a trial, right? But it was really launched
[40:30.200 --> 40:34.360]  countrywide. Everyone could go to the app store and download this app and register and
[40:34.360 --> 40:43.000]  upload data. What was limited to a couple of select test municipalities was the
[40:44.360 --> 40:51.580]  tracing functionality. So, they only provided notification services
[40:52.660 --> 40:56.500]  based on the data from the system to the people in the test municipalities,
[40:56.500 --> 41:05.920]  whereas everyone could and would upload data. And some of the reason might be because some of
[41:05.920 --> 41:13.140]  the issues we reported in our preliminary report, but the app was promptly reverse engineered and
[41:13.140 --> 41:20.720]  inspected by a critical tech community. And there were also, as I mentioned on the previous slide,
[41:20.800 --> 41:28.460]  a few scalability issues that we pointed out, which led to the fact that the backend didn't
[41:28.460 --> 41:34.540]  quite scale so that when so many people in the country tried to download and register at the
[41:34.540 --> 41:42.480]  same time, the backend just couldn't handle the pressure, basically. Also, the fact that they
[41:42.480 --> 41:51.860]  hadn't had time to focus on energy efficiency led to a bunch of low user scores and negative reviews
[41:51.860 --> 42:01.080]  and so on. Shortly thereafter, Google and Apple announced their API collaboration, which would fix
[42:01.080 --> 42:07.740]  all the Bluetooth issues in the platform, basically, which would enable you to do
[42:09.380 --> 42:15.700]  Bluetooth activity in a background app, leaving no real reason
[42:16.440 --> 42:20.660]  to use location services for the purpose of contact tracing,
[42:21.840 --> 42:27.600]  if you agree that Bluetooth data is enough to do contact tracing, that is.
[42:28.800 --> 42:36.180]  But this new API demands that you don't use location services for access to the API.
[42:38.480 --> 42:46.360]  And after that, a large amount of the Norwegian tech community mobilized and
[42:47.920 --> 42:53.740]  published a petition. Over 300 professionals in security and privacy and tech in general
[42:54.400 --> 42:58.220]  basically stated that we don't have to choose between privacy and health,
[42:58.220 --> 43:02.820]  and that we have the knowledge and technology to implement a better solution,
[43:03.700 --> 43:14.040]  which was met with some scorn from, well, the supplier mainly.
[43:17.520 --> 43:26.580]  So when we delivered our final public report on May the 20th, this report included some findings
[43:26.580 --> 43:31.920]  and recommendations. Now, most of the findings I've mentioned already, but they included
[43:31.920 --> 43:36.200]  aggressive analytics, static identifiers in Bluetooth contact,
[43:36.200 --> 43:40.600]  eternal connection strings, using preview features for personal data,
[43:40.600 --> 43:47.840]  limitations in auditing solutions, data deletion, also deleting the audit logs,
[43:47.840 --> 43:53.620]  quality issues in the contact analysis code, as well as using SMS, which is an insecure channel,
[43:53.620 --> 44:02.540]  as a notifications channel. Now, in evaluating whether security and privacy was properly handled,
[44:02.540 --> 44:07.700]  our evaluation concluded no on both counts, unfortunately.
[44:09.320 --> 44:16.560]  And then we delivered a few recommendations, which were basically clarifying the regulation,
[44:16.560 --> 44:21.400]  splitting the app in two based on purpose, so one for contact tracing and one for
[44:22.100 --> 44:29.160]  the data collection for those who wanted to contribute to that, to remove all data that
[44:29.160 --> 44:33.600]  they didn't need at any point in time, so basically work towards more of a data minimization
[44:34.000 --> 44:40.660]  thing, to implement differential privacy in their data aggregation plans,
[44:41.440 --> 44:49.380]  to consider to rewrite towards a more decentralized solution, and to potentially
[44:49.380 --> 44:55.800]  implement differential privacy before uploading user data, like differential privacy locally on
[44:55.800 --> 45:02.500]  the device, and also make as much of the source code as possible open source so that people could
[45:02.500 --> 45:09.580]  verify the functionality for themselves, and to regularly re-evaluate.
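
To make the recommendation about differential privacy locally on the device a bit more concrete, here is a minimal Python sketch of randomized response, one of the simplest local differential privacy mechanisms. It is an illustration under assumptions, not what the expert group specified or what any party implemented: the function names, the single yes/no attribute, and the choice of `epsilon = 1.0` are all hypothetical.

```python
import math
import random
from typing import Sequence


def randomized_response(truth: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (e^eps + 1), otherwise
    report the opposite. Requires epsilon > 0."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return truth if rng.random() < p_truth else not truth


def estimate_true_rate(reports: Sequence[bool], epsilon: float) -> float:
    """Debias the noisy reports to estimate the fraction of true bits.
    With p = P(report truth): E[observed] = p*rate + (1-p)*(1-rate)."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


rng = random.Random(0)   # fixed seed so the example is repeatable
epsilon = 1.0            # hypothetical privacy budget per report
truths = [i < 2_000 for i in range(20_000)]  # 10% of users have the attribute
reports = [randomized_response(t, epsilon, rng) for t in truths]
estimated_rate = estimate_true_rate(reports, epsilon)
```

Each individual report is deniable, since any single answer may have been flipped on the device before upload, yet the aggregate estimate stays close to the true 10% when enough users contribute. The cost is extra noise in the statistics, which grows as `epsilon` gets smaller.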
[45:11.460 --> 45:19.440]  Now, the aftermath was pretty interesting. At the same government press conference that we
[45:20.040 --> 45:24.960]  delivered our conclusions on, the Institute of Public Health basically just disagreed with
[45:24.960 --> 45:31.260]  our conclusion, and then the supplier, Simula Research Laboratory,
[45:31.680 --> 45:38.160]  basically just wrote a blog post publicly attacking the expert group,
[45:40.100 --> 45:46.060]  and questioning our integrity, claiming that our conclusions were based on personal opinions,
[45:46.060 --> 45:51.520]  that our recommendations were politically motivated, and so on. So here you can see the original
[45:52.460 --> 45:58.220]  blog post translated by me from Norwegian to English, though they've since edited it
[45:58.220 --> 46:05.400]  exchanging political for personal a few times. But nevertheless, questioning the motives of an
[46:05.400 --> 46:10.200]  impartial external group tasked with evaluating your work in this way, kind of concerning
[46:10.200 --> 46:18.780]  nonetheless, I think. But then again, this is another quote from some higher up at the same
[46:18.780 --> 46:24.480]  company. There are many countries I think should not use the Norwegian solution, precisely because
[46:24.480 --> 46:29.200]  they don't have a well-regulated democracy. They don't have strong privacy interests and
[46:29.200 --> 46:37.700]  governments to keep watch. Now, following this logic in the Norwegian app, privacy would then
[46:37.700 --> 46:44.860]  by definition not be handled responsibly, as any privacy guarantees would be contingent on trust,
[46:44.860 --> 46:51.820]  which isn't much of a guarantee. Key point here, data protection and privacy are two different
[46:51.820 --> 47:02.820]  things. After this, the parliament actually got involved and decided that they'd have to split
[47:02.820 --> 47:09.280]  the app based on purpose. So one app for contact tracing and another for data collection for those
[47:09.280 --> 47:17.320]  who want to contribute to that. Then the Norwegian Data Protection Authority concluded that the degree
[47:17.320 --> 47:22.580]  of privacy invasiveness was not justified. Now, this was based on both things like data
[47:23.600 --> 47:29.480]  minimization issues, but also on the current situation in Norway with a low amount of
[47:30.000 --> 47:39.820]  infection, I guess. The health authorities, i.e. the Institute of Public Health, then chose to
[47:39.820 --> 47:46.280]  stop all data collection and to delete all the data they already had. And the next day, Amnesty
[47:46.280 --> 47:51.920]  International stated that they found the Norwegian app to be amongst the most dangerous tracing apps
[47:52.440 --> 47:59.020]  for privacy, which was where we started out. Then international media picked up on this,
[47:59.660 --> 48:04.700]  went around New York Times, BBC, and all those places.
[48:07.320 --> 48:10.080]  Now, what can we learn from all of this?
[48:11.460 --> 48:15.320]  Well, you don't want to be in the position of trying to figure out this stuff while the world's
[48:15.320 --> 48:24.180]  crumbling around you, I think. But I think we need to be able to still stick to our principles,
[48:24.180 --> 48:30.120]  especially in times of crises. Because privacy isn't only a compliance issue nor a subset of
[48:30.120 --> 48:38.160]  security. And it's pretty problematic, I think, that the supplier has consistently been let
[48:38.160 --> 48:44.460]  to do all the communications and just refuse to listen to or even understand criticism.
[48:45.740 --> 48:51.400]  And basically argued that the solution works in Norway because we have such a high degree
[48:51.400 --> 49:01.340]  of trust in the state. Because this misses the point entirely. And as to what happens next,
[49:01.340 --> 49:06.980]  there were talks of experimenting with the Apple Google API. But now that everything's on ice,
[49:06.980 --> 49:17.840]  who knows? So quick recap. Norway is hence worst in class in contact tracing for COVID-19.
[49:17.840 --> 49:22.000]  And that's pretty unexpected. It's not something I would have seen coming like
[49:22.000 --> 49:30.620]  half a year ago even. But continuously storing where everyone is, at what point,
[49:30.620 --> 49:36.160]  and who they're meeting with, that's an extremely aggressive move and excessive even. Whereas other
[49:36.160 --> 49:42.600]  protocols and solutions make as little of an impact on privacy as possible and make users
[49:42.600 --> 49:48.580]  take active, explicit choices about their data, the Norwegian app Smittestopp did not do that.
[49:51.170 --> 49:56.050]  Now, I think this highlights the need for a drastic improvement in general understanding
[49:56.050 --> 50:00.930]  of privacy, even amongst technologists, so that people will know how to make solutions
[50:00.930 --> 50:06.610]  that are private by design and we don't end up in this situation again. And there are, of course,
[50:06.610 --> 50:11.450]  also discussions to be had on a more principled level, which might illuminate this issue to even
[50:11.450 --> 50:19.450]  more people like decision makers, directors, politicians, and so on. Because privacy is a
[50:19.450 --> 50:26.550]  foundational human right and it's guaranteed by national constitutions and the EU and the
[50:26.550 --> 50:32.150]  Universal Declaration of Human Rights. And that means there is a minimum of how much of these
[50:32.150 --> 50:37.890]  rights that people must always have, meaning that states can only go so far in any case,
[50:37.890 --> 50:41.250]  even though they, of course, might get more leeway in emergencies.
[50:42.090 --> 50:49.370]  And that's because our principles, by virtue of being principles, always apply. Also, you know,
[50:49.370 --> 50:54.090]  privacy is a defining feature of liberal democracies, we should take this stuff seriously.
[50:54.610 --> 51:01.650]  But where then is this line? That's a hard issue and one that I don't really have a good answer to,
[51:01.650 --> 51:10.050]  but that discussion is needed anyways. Because, you know, I experienced, even in working with
[51:10.050 --> 51:14.230]  the expert group, that there are people that honestly believe that privacy costs are a pure
[51:14.230 --> 51:20.610]  balancing act, that any degree of invasiveness can be warranted if it leads to effectiveness,
[51:20.610 --> 51:26.210]  if the results are good enough, in whatever sense you're evaluating that.
[51:27.250 --> 51:31.310]  I tend to think the laws and regulations are pretty clear that this is not the case,
[51:32.090 --> 51:35.830]  and, you know, also protection against the tyranny of the majority is another
[51:35.830 --> 51:38.650]  defining feature of liberal democracies after all.
[51:42.690 --> 51:50.990]  So, I think the head of Amnesty Security Lab had a pretty good quote here in that,
[51:50.990 --> 51:54.910]  you know, privacy doesn't need to be a casualty in the rolling out of these apps.
[51:54.910 --> 52:02.410]  And after all, we have to remember Amnesty, the DPA, Parliament, EU, Google and Apple,
[52:02.410 --> 52:08.310]  our expert group and 300 professionals have all warned the involved parties several times.
[52:08.390 --> 52:13.350]  And the fact that there's been no change until the entire app was put on hold is very strange,
[52:13.350 --> 52:18.010]  given the degree of trust we usually pride ourselves in placing in experts in Norway.
[52:19.230 --> 52:22.870]  You know, when there are privacy preserving alternatives,
[52:22.870 --> 52:28.590]  such as other existing protocols and solutions, they should always be explored first.
[52:29.190 --> 52:36.470]  Of course, it's laudable to want to solve this very real and very big problem with the means
[52:36.470 --> 52:42.010]  available to us, but we really can't excuse bad work and lack of understanding by claiming that
[52:42.010 --> 52:50.330]  the ends justify the means, in my opinion. Now, hopefully, you would rewrite the app in a more
[52:50.330 --> 52:54.950]  privacy preserving way and try to learn from what went wrong here in order not to make this mistake
[52:54.950 --> 53:05.870]  again. But the design choices were questionable or even arguable at first, but they never really
[53:05.870 --> 53:12.890]  changed. And now that it's put on hold, this is the worst that could happen because now we're
[53:12.890 --> 53:17.650]  seeing a bit of a rise again and the government is maybe having to crack down a bit and implement
[53:17.650 --> 53:26.490]  more safety measures, and now we have no app. So this is not a good place to be in.
[53:27.690 --> 53:33.430]  But as for the future, at least there's hope, because only yesterday I read this.
[53:33.990 --> 53:41.090]  But only time will tell where we go from here. And that's all I had to say. With that,
[53:41.090 --> 53:44.890]  I'd like to say thank you. It's been great fun and I appreciate you all being here.
[53:45.630 --> 53:46.970]  And now for some Q&A.
